Skip to content

fix(tests): make cache-lockfile-parity test resilient to leftover deploy artifacts and timestamp drift#1125

Merged
danielmeppiel merged 1 commit intomainfrom
fix/cache-parity-test-cleanup
May 3, 2026
Merged

fix(tests): make cache-lockfile-parity test resilient to leftover deploy artifacts and timestamp drift#1125
danielmeppiel merged 1 commit intomainfrom
fix/cache-parity-test-cleanup

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

TL;DR

Fixes the tests/integration/test_cache_lockfile_parity.py regression-trap added in #1116, which has been failing on main since merge (run 25287779252 blocks v0.12.1 release CI on macOS arm64 build-validate and integration tests across ubuntu/windows/arm64).

Test-only change. No product code touched -- the cache layer was always deterministic; the fixture was wrong.

Root cause

Two fixture-level bugs caused false positives:

1. Stale deployed-files state leaked between regimes

_reset_install_state removed apm_modules/ + apm.lock.yaml between runs but left the prior run's integration outputs (.github/agents/, .agents/skills/, ...) on disk.

With the lockfile gone, the integrate phase had an empty managed_files set; the leftover files looked like user-authored collisions to BaseIntegrator.check_collision(); they were skipped and never appended to target_paths -- so the new lockfile recorded only the single freshly-written file, not the package's full deployed_files list.

Cold-cache (run A, fresh disk) recorded 5 files. Warm-cache (B) and no-cache (C) recorded 1. Hence the drift.

2. generated_at drifts between independent runs

The lockfile's generated_at field is captured at write time. With no existing lockfile to dedupe against (each fresh project starts empty), every run produces a different timestamp. The byte-identical sha256 assertion was unsatisfiable by construction.

Fix

  1. Clone the project tree per regime via _clone_project() instead of trying to surgically reset state. Each run starts from pristine filesystem state, which is what "same input, same output" actually means for the parity invariant. No hand-rolled cleanup list to keep in sync with future target deploy roots (.cursor/, .claude/, .github/hooks/, .agents/, etc).

  2. Hash the lockfile excluding generated_at in _lockfile_sha. The parity contract is about resolution outcome (resolved_commit, content_hash, deployed_files, package_type, ...) -- not the wall-clock write timestamp. The assertion message and intent are unchanged.

Updated docstring + test docstring to reflect "content-identical (modulo generated_at)" rather than "byte-identical".

Validation

Locally on macOS arm64:

$ GITHUB_TOKEN=$(gh auth token) uv run --extra dev pytest tests/integration/test_cache_lockfile_parity.py -x -v
...
tests/integration/test_cache_lockfile_parity.py::test_lockfile_byte_identical_across_cache_regimes PASSED [100%]
============================== 1 passed in 12.73s ==============================

Lint contract:

$ uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/
All checks passed!
668 files already formatted

How to verify the root cause

Before this PR, running the test on a fresh checkout produces:

AssertionError: Lockfile drifted between cold-cache and warm-cache runs --
the cache layer is mutating resolution results.

diff of the two lockfiles before the fix showed run A had 5 entries under deployed_files (.agents/skills/style-checker + 4 .github/{agents,instructions,prompts}/* paths) while runs B and C had only .agents/skills/style-checker. After the project-isolation fix, the only remaining difference was generated_at -- which the second part of the fix handles.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…loy artifacts and timestamp drift

The regression-trap test added in #1116 was failing on main, blocking the
v0.12.1 release CI on macOS arm64 build-validate and all 3 OSes' integration
tests jobs.

Two fixture-level bugs caused false positives:

1. Stale deployed-files state across regimes
   _reset_install_state removed apm_modules/ + apm.lock.yaml between runs
   but left the prior run's integration outputs (.github/agents/, .agents/
   skills/, ...) on disk. With the lockfile gone, the integrate phase had an
   empty managed_files set; the leftover files looked like user-authored
   collisions to BaseIntegrator.check_collision(); they were skipped and
   never appended to target_paths -- so the new lockfile recorded only the
   single freshly-written file, not the package's full deployed_files list.
   Cold-cache (run A, fresh disk) recorded 5 files; warm-cache (B) and
   no-cache (C) recorded 1.

   Fix: clone the project tree into a fresh subdir per regime via
   _clone_project(). Each run starts from pristine filesystem state, which
   is what 'same input, same output' actually means for this invariant. No
   hand-rolled cleanup list to keep in sync with new target deploy roots.

2. generated_at always drifts between independent runs
   The lockfile's generated_at is captured at write time; with no existing
   lockfile to dedup against (each cloned project starts empty), every run
   produces a fresh timestamp. The byte-identical assertion was
   unsatisfiable by construction.

   Fix: hash the lockfile excluding the generated_at line. The parity
   contract is about resolution outcome (resolved_commit, content_hash,
   deployed_files, package_type, ...), not the wall-clock write timestamp.

Verified locally with GITHUB_TOKEN=$(gh auth token):
  uv run --extra dev pytest tests/integration/test_cache_lockfile_parity.py
  -> 1 passed in 12.73s

Lint:
  uv run --extra dev ruff check src/ tests/  -> All checks passed
  uv run --extra dev ruff format --check src/ tests/  -> 668 files already formatted

No product code changed -- the underlying cache layer was always
deterministic; only the fixture was wrong.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 3, 2026 19:16
@danielmeppiel danielmeppiel merged commit 41ab121 into main May 3, 2026
19 checks passed
@danielmeppiel danielmeppiel deleted the fix/cache-parity-test-cleanup branch May 3, 2026 19:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a regression in the cache/lockfile parity integration test (tests/integration/test_cache_lockfile_parity.py) that was producing false positives due to leftover on-disk deploy artifacts between runs and unavoidable generated_at timestamp drift.

Changes:

  • Update the parity invariant and hashing logic to ignore generated_at when comparing lockfiles across regimes.
  • Run each cache regime against a freshly cloned copy of the project tree to prevent previously deployed integration outputs from affecting subsequent runs.
  • Update module/test docstrings to reflect the revised invariant (content-identical modulo generated_at).
Show a summary per file
File Description
tests/integration/test_cache_lockfile_parity.py Makes the cache-regime parity test resilient by isolating filesystem state per run and excluding generated_at from the lockfile hash.

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 2

Comment on lines 85 to +100
def _lockfile_sha(project: Path) -> str:
"""Hash the lockfile excluding the `generated_at` line.

`generated_at` is a wall-clock timestamp captured at write time, so it
necessarily differs between independent runs. The parity invariant is
about resolution outcome (resolved_commit, content_hash, deployed_files,
package_type, ...), not the write timestamp.
"""
lock = project / "apm.lock.yaml"
assert lock.is_file(), "apm.lock.yaml not produced by install"
return hashlib.sha256(lock.read_bytes()).hexdigest()
canonical = "\n".join(
line
for line in lock.read_text(encoding="utf-8").splitlines()
if not line.startswith("generated_at:")
)
return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
tmp_path: Path,
) -> None:
"""A, B, C must produce byte-identical apm.lock.yaml.
"""A, B, C must produce content-identical apm.lock.yaml (modulo `generated_at`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants